Efficient Sequential Pattern Mining Algorithms
Abstract
Sequential pattern mining is a heavily researched area of data mining with a wide variety of applications. Discovering frequent sequences is challenging because the algorithm must process a combinatorially explosive number of possible sequences. Most methods for sequential pattern mining build on approaches to the traditional itemset mining task, because the former can be interpreted as a generalization of the latter. Several algorithms use a level-wise “candidate generate and test” approach, while others use projected databases to discover the frequent sequences. This paper presents a classification of the well-known sequence mining algorithms. Because each algorithm has its own advantages and drawbacks in execution time and memory requirements, and the exact aims of the algorithms differ as well, an exact ranking of the methods is omitted. A basic level-wise algorithm, GSP, is described in detail. Because level-wise algorithms generally need less memory than projection-based ones, an efficient implementation of the GSP algorithm is also suggested. Two novel methods, the Bitmap-based GSP (BGSP) and the SM-Tree (State Machine-Tree) algorithms, are presented as enhancements of the GSP-based sequential pattern mining approach.
Key-Words: Data mining, Sequential pattern mining, GSP algorithm, Itemset discovery, Apriori algorithm
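The level-wise "candidate generate and test" idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simplified setting in which every sequence element is a single item and no time or gap constraints apply; GSP proper also handles itemset elements and constraints. The function names are hypothetical and chosen for illustration only, and the sketch does not reproduce the paper's implementation, BGSP, or the SM-Tree.

```python
def is_subsequence(candidate, sequence):
    """Return True if the items of candidate occur in sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    # "item in remaining" consumes the iterator, so order is enforced.
    return all(item in remaining for item in candidate)


def support(candidate, database):
    """Count the database sequences that contain candidate as a subsequence."""
    return sum(1 for sequence in database if is_subsequence(candidate, sequence))


def generate_candidates(frequent):
    """Join frequent k-sequences into (k+1)-candidates and prune by anti-monotonicity."""
    candidates = set()
    for left in frequent:
        for right in frequent:
            # Join step: left without its first item must equal right without its last item.
            if left[1:] == right[:-1]:
                candidate = left + (right[-1],)
                # Prune step: every k-subsequence obtained by deleting one item must be frequent.
                if all(candidate[:i] + candidate[i + 1:] in frequent
                       for i in range(len(candidate))):
                    candidates.add(candidate)
    return candidates


def mine_frequent_sequences(database, min_support):
    """Level-wise mining: test level k, then generate the candidates for level k+1."""
    items = {item for sequence in database for item in sequence}
    level = {(item,) for item in items if support((item,), database) >= min_support}
    result = {}
    while level:
        for pattern in level:
            result[pattern] = support(pattern, database)
        candidates = generate_candidates(level)
        level = {c for c in candidates if support(c, database) >= min_support}
    return result


if __name__ == "__main__":
    # Toy database, assumed for illustration only.
    toy_database = [("a", "b", "c"), ("a", "c"), ("a", "b", "c", "d"), ("b", "c")]
    for pattern, count in sorted(mine_frequent_sequences(toy_database, 3).items()):
        print(pattern, count)
```

Each level requires one pass over the database per candidate in this sketch, which is the cost that more efficient counting structures aim to reduce.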
Similar resources
SPMLS : An Efficient Sequential Pattern Mining Algorithm with candidate Generation and Frequency Testing
Sequential pattern mining is a fundamental and essential field of data mining because of its extensive scope of applications, spanning from forecasting user shopping patterns to scientific discoveries. The objective is to discover frequently appearing sequential patterns in a given set of sequences. Nowadays, many studies have contributed to the efficiency of sequential pattern mining a...
Efficiently Mining Closed Subsequences with Gap Constraints
Mining frequent subsequence patterns from sequence databases is a typical data mining problem and various efficient sequential pattern mining algorithms have been proposed. In many problem domains (e.g., biology), the frequent subsequences confined by the predefined gap requirements are more meaningful than the general sequential patterns. In this paper we re-examine the closed sequential patter...
Comparison of Efficient Algorithms for Sequence Generation in Data Mining
Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. Several major data mining techniques have been developed and are used in data mining projects, including association, classification, clustering, sequential patterns, prediction and decision trees. Among different tasks in data mining, sequenti...
Efficient Analysis of Pattern and Association Rule Mining Approaches
The process of data mining produces various patterns from a given data source. The most recognized data mining tasks are the process of discovering frequent itemsets, frequent sequential patterns, frequent sequential rules and frequent association rules. Numerous efficient algorithms have been proposed to do the above processes. Frequent pattern mining has been a focused topic in data mining re...
Mining Compressed Repetitive Gapped Sequential Patterns Efficiently
Mining frequent sequential patterns from sequence databases has been a central research topic in data mining, and various efficient sequential pattern mining algorithms have been proposed and studied. Recently, in many problem domains (e.g., program execution traces), a novel line of sequential pattern mining research, called mining repetitive gapped sequential patterns, has attracted the attention of m...
Data Mining in Sequential Pattern for Asynchronous Periodic Patterns
Data mining is becoming an increasingly important tool to transform enormous data into useful information. Mining periodic patterns in temporal datasets plays an important role in data mining and knowledge discovery tasks. This paper presents the design and development of software for sequential pattern mining for asynchronous periodic patterns in temporal databases. Comparative study of various alg...
Publication date: 2005